Large-Scale Knowledge Acquisition from Botanical Texts

Authors

  • François Role
  • Milagros Fernández Gavilanes
  • Éric Villemonte de la Clergerie
Abstract

Free text botanical descriptions contained in printed floras can provide a wealth of valuable scientific information. In spite of this richness, these texts have seldom been analyzed on a large scale using NLP techniques. To fill this gap, we describe how we managed to extract a set of terminological resources by parsing a large corpus of botanical texts. The tools and techniques used are presented as well as the rationale for favoring a deep parsing approach coupled with error mining methods over a simple pattern matching approach.

Similar resources

Automatic Knowledge Acquisition by Semantic Analysis and Assimilation of Textual Information

Automatic knowledge acquisition is one of the bottlenecks in artificial intelligence and large-scale applications of natural language processing (NLP). There are many efforts to create large knowledge bases (KBs) or to automatically derive knowledge from large text corpora. On the one hand, we meet KBs like CYC, where a tremendous amount of work has been invested by knowledge enterers who have ...

The Cognitive and Social Grounding of Large-Scale Knowledge Resources

We describe the general approach of a sub-project seeking to develop cognitively and socially adequate knowledge resources. Specifically, the present paper outlines a text file acquisition system that (a) allows any user to submit digitized versions of literary texts, (b) lets users improve their contributions at any later opportunity, and (c) encourages all users to evaluate contributed text files...

Kleo: A Bootstrapping Learning-by-Reading System

KLEO is a bootstrapping learning-by-reading system that builds a knowledge base in a fully automated way by reading texts for a domain. KLEO’s initial knowledge base is a small knowledge base that consists of domain independent knowledge and KLEO expands the knowledge base with the information extracted from texts. A key facility in KLEO is knowledge integration which combines new information g...

Integrating Natural Language, Knowledge Representation and Reasoning, and Analogical Processing to Learn by Reading

  • radically change the economics of building large knowledge bases
  • provide a platform for cognitive simulations of larger-scale phenomena
  • Learning Reader learns by reading simplified language texts
  • Manages syntactic complexity
  • Unconstrained vocabulary, unlike controlled languages
  • Learning Reader combines
  • Natural language processing
  • A large-scale knowledge base
  • Deductive reasoning
  • Analog...

Combining NLP and statistical techniques for lexical acquisition

The growing availability of large on-line corpora encourages the study of word behaviour directly from accessible raw texts. However, the methods by which lexical knowledge should be extracted from plain texts are still a matter of debate and experimentation. This paper presents an integrated tool for lexical acquisition from corpora, ARIOSTO, based on a hybrid methodology that combines ...


Publication date: 2007